Classification of properties and their relation to chemical bonding: Essential steps toward the inverse design of functional materials

To design advanced functional materials, different concepts are currently pursued, including machine learning and high-throughput calculations. Here, a different approach is presented, which uses the innate structure of the multidimensional property space. Clustering algorithms confirm the intricate structure of property space and relate the different property classes to different chemical bonding mechanisms. For the inorganic compounds studied here, four different property classes are identified and related to ionic, metallic, covalent, and recently identified metavalent bonding. These different bonding mechanisms can be quantified by two quantum chemical bonding descriptors, the number of electrons transferred and the number of electrons shared between adjacent atoms. Hence, we can link these bonding descriptors to the corresponding property portfolio, turning bonding descriptors into property predictors. The close relationship between material properties and quantum chemical bonding descriptors can be used for an inverse material design, identifying particularly promising materials based on a set of target functionalities.


The Expert Classification:
The classical assignment of bonding types to each material is simple and unequivocal for metals, which are defined by a vanishing bandgap EG = 0 eV (leading, e.g., to high electrical conductivity at room temperature). For the other bonding types, no such absolute criterion exists. Ionic and covalent bonding describe two opposing concepts of how chemical bonds can be achieved in a solid. For ionic bonding, electrons are transferred from one atom to another, creating a negatively charged anion and a positively charged cation. The electrostatic interaction between these ions then produces an attractive force, which leads to bonding. For a covalent bond, electron pairs are formed between atoms, i.e., electrons are shared. This configuration lowers the total energy of the system and stabilizes the atomic configuration. (It should be noted that only material properties were used for the classification, and not the values of Electrons Transferred and Electrons Shared.) While an ionic bond requires two different elements as bonding partners (e.g., Na and Cl), a pure covalent bond can only be achieved between two identical elements (e.g., in diamond, which consists only of carbon atoms). However, in every compound composed of more than one element, some electrons are always transferred and some are always shared. Hence covalent and ionic bonding contributions are always present to a varying degree, so that for some compounds it remains ambiguous to which type they ultimately belong. These compounds are sometimes labeled as "ionocovalent" (e.g., ZnO). Metavalent bonding is characterized by low to moderate charge transfer in combination with an approximately half-filled p-band. While such a half-filled band would normally lead to a vanishing bandgap EG = 0 eV, small distortions in the structure and/or the charge transfer break the symmetry of the system and open up a small bandgap.
Metavalent compounds can therefore also be thought of as "incipient metals" (19). This competition between localization and delocalization of the electrons creates a unique property portfolio for compounds utilizing metavalent bonding (MVB), including a high Born effective charge Z*, a high optical dielectric constant ε∞ and a high Grüneisen parameter γTO, which is a measure of the anharmonicity (softness) of the compounds. In a Gedankenexperiment, another important criterion can be established. If there is a close relationship between distinct material properties and distinct bonding mechanisms, the transition from one bonding mechanism to another should be accompanied by discontinuous behavior of one or more properties. Such border transitions were investigated by Guarneri et al. (48), showing that discontinuous behavior does indeed exist between covalent and metavalent compounds. Unfortunately, such border transitions cannot be realized for all compounds, and their existence for some systems does not strictly prove that they exist for all systems (and across all borders). The assignment of (and affection for) chemical bonding types by chemists and physicists might hence still be biased by ingrained heuristic knowledge. By employing an a priori unbiased clustering algorithm, we will show that the concept of distinct bonding types holds true even for a purely data-driven approach. Table S1 shows the minimum and maximum values of each property within the respective bonding types, as assigned traditionally by the expert classification (within the database utilized for the classification). Figure S1 shows the correlation of all properties with each other. While some of these plots show pronounced correlations and novel insights, e.g. Z*+ or the band gap EG vs. the electrical conductivity, for other property combinations no clear correlation is visible by eye. This shortcoming is particularly pronounced if a certain property range is not characteristic of a certain bonding mechanism, as found, e.g., for the melting temperature, the atomic density or the mass density.

Classification Metrics:
Different metrics can be utilized to evaluate the classification result of the Expectation Maximization algorithm (EM algorithm). Figure S2 shows the Average Log Likelihood (ALL) metric for different numbers of allowed clusters.

Figure S2: Average Log-Likelihood for 2-6 Clusters. Higher values indicate better results.
It can be observed that the classification quality increases monotonically with the number of allowed clusters (higher values are better). This is expected, as adding a cluster is comparable to adding a degree of freedom, which is bound to improve the quality of any fit.
However, looking at the differences of the ALL values going from n to n+1 clusters, it becomes apparent that the change is quite large going from 2 to 3 and from 3 to 4 clusters, while it is less significant going from 4 to 5 and from 5 to 6. This indicates that the classification quality improves substantially only up to 4 clusters.
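The comparison of ALL values for successive cluster numbers can be sketched as follows. This is a minimal illustration and not the pipeline used for Figure S2: it assumes scikit-learn's GaussianMixture as the EM implementation and uses synthetic data (four well-separated groups) in place of the property database. All variable names are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for the property table (NOT the paper's data):
# four well-separated groups of 100 samples each in a 3-D feature space.
centers = np.array([[0, 0, 0], [8, 0, 0], [0, 8, 0], [0, 0, 8]], dtype=float)
X = np.vstack([c + rng.normal(size=(100, 3)) for c in centers])

# Fit a Gaussian mixture by EM for 2-6 clusters and record the
# average log-likelihood per sample (GaussianMixture.score).
scores = {}
for k in range(2, 7):
    gmm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(X)
    scores[k] = gmm.score(X)

# Gain in ALL when going from k to k+1 clusters.
gains = {k: scores[k + 1] - scores[k] for k in range(2, 6)}
for k, gain in gains.items():
    print(f"{k} -> {k + 1} clusters: ALL gain = {gain:.3f}")
```

For data with four genuine groups, the gains for 2 -> 3 and 3 -> 4 should be clearly larger than those for 4 -> 5 and 5 -> 6, which mirrors the reasoning applied to Figure S2.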

Evolution of Cluster Formation:
The EM analysis of the material properties is based purely on a measure of the overall clustering quality (intra-cluster coherence) and provides little insight into the relative similarity between clusters (inter-cluster correlation). To assess this aspect as well, we repeat the EM clustering with increasing numbers of Gaussian modes and track how clusters split, i.e., how the samples of a coarse cluster are distributed among finer clusters. Figure S3 shows that for 2 clusters, metals and MVB compounds form one cluster and are separated from a cluster of ionic and covalent materials. For 3 clusters, metals and ionic compounds are separated, while covalent and MVB materials are clustered together. The case of 4 clusters has already been discussed, and for 5 clusters the covalent and ionic groups split into three clusters.

Figure S3: Sequence of clustering results for different numbers of clusters, utilizing the t-distributed stochastic neighbor embedding (t-SNE).
This shows that with three clusters, the traditional allocation of compounds according to chemistry is reproduced, with the MVB compounds being part of the covalent group. However, going to 4 clusters is not only numerically favorable (see Figure S2), it also flawlessly retrieves the materials that are assigned to be metavalent compounds. The notion that metavalently bonded systems are special is further underlined by the fact that the MVB materials were joined with the metals for two clusters, but with the covalent cluster for three clusters, suggesting that they are about equally different from covalent and metallic systems. Hence, when forced to merge them with another group, the algorithm is indifferent as to whether they join the covalent or the metallic systems.
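How the samples of a coarse cluster are distributed among finer clusters can be tabulated with a simple cross-count. The sketch below assumes that cluster labels are already available as lists. The toy labels only mimic the splitting pattern described above and are not taken from Data S1, and the function name split_table is hypothetical.

```python
from collections import Counter, defaultdict


def split_table(coarse, fine):
    """For each coarse cluster, count how its samples distribute over fine clusters."""
    if len(coarse) != len(fine):
        raise ValueError("label lists must have equal length")
    table = defaultdict(Counter)
    for c, f in zip(coarse, fine):
        table[c][f] += 1
    return {c: dict(counts) for c, counts in table.items()}


# Toy example (hypothetical labels, two samples per expert class):
# samples:  metal, metal, MVB, MVB, ionic, ionic, covalent, covalent
k2 = [0, 0, 0, 0, 1, 1, 1, 1]  # metals+MVB | ionic+covalent
k3 = [0, 0, 2, 2, 1, 1, 2, 2]  # metals | ionic | covalent+MVB
k4 = [0, 0, 3, 3, 1, 1, 2, 2]  # metals | ionic | covalent | MVB

print(split_table(k2, k3))  # {0: {0: 2, 2: 2}, 1: {1: 2, 2: 2}}
print(split_table(k3, k4))  # {0: {0: 2}, 2: {3: 2, 2: 2}, 1: {1: 2}}
```

In this toy case, the MVB samples leave the metal cluster when going from 2 to 3 clusters and separate from the covalent samples when going from 3 to 4, which is the kind of reassignment tracked in Figure S3.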

V. Using the quantum-chemical bonding Map to predict and design properties
An advantage of the ES/ET-based map (see Figure 6) is that navigation within this framework is much easier. For example, a possible way to increase or decrease the ET value of a compound is to replace one constituent with another element from the same group of the periodic table. While the density of data points in Figure 6 is not yet sufficient to inversely design a material by picking a property set of choice, we can show a proof-of-principle example with reduced complexity. The compound ZnS in the wurtzite structure was excluded from the general dataset, as initially no value for the Born effective charge Z* was available. However, ES, ET, conductivity σ and band gap EG are known, and a value for Z* was computed to complete the property set. We can therefore try to predict these properties of ZnS (wurtzite) and compare them with the corresponding values from the literature. To predict properties from ES/ET coordinates, the four closest compounds surrounding ZnS (wurtzite) with higher/lower ES and ET, respectively, were chosen (ZnS, CdS, SnS, PbTe:0.2-Bi2Te3:0.8), and the values for conductivity σ and band gap EG were calculated by bilinear interpolation. However, since PbTe:0.2-Bi2Te3:0.8 employs metavalent bonding, and we expect discontinuous property behavior when crossing bonding borders, this point was excluded from the interpolation.
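The interpolation step can be illustrated with a small sketch. The text does not specify how the interpolation is carried out once one of the four neighbours is excluded, so the following rests on an assumption: with four neighbours, the surface f(ES, ET) = a + b·ES + c·ET + d·ES·ET is fitted through the points (bilinear form), and with three neighbours the cross term is dropped, i.e. a plane is fitted. The function names and all numerical values are hypothetical placeholders, not data from this work.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("neighbours are degenerate (e.g. collinear)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def basis(es, et, bilinear):
    """Basis terms: [1, ES, ET] for a plane, plus ES*ET for the bilinear form."""
    terms = [1.0, es, et]
    if bilinear:
        terms.append(es * et)
    return terms


def interpolate(neighbours, es, et):
    """Interpolate a property at (es, et) from 3 or 4 (ES, ET, value) tuples."""
    if len(neighbours) not in (3, 4):
        raise ValueError("expected 3 or 4 neighbours")
    bilinear = len(neighbours) == 4
    A = [basis(x, y, bilinear) for x, y, _ in neighbours]
    coeffs = solve(A, [value for _, _, value in neighbours])
    return sum(c * t for c, t in zip(coeffs, basis(es, et, bilinear)))


# Placeholder neighbours (ES, ET, property value); NOT values from the paper.
three_neighbours = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0), (0.0, 1.0, 4.0)]
print(interpolate(three_neighbours, 0.5, 0.5))  # 3.5
```

If a property spans several orders of magnitude, as electrical conductivity typically does, one might choose to interpolate its logarithm instead. That choice is likewise an assumption and is not stated in the text.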
The results are summarized in table S2:

VI. Full dataset of materials and properties in Data S1 (separate file)
All compounds with their respective classification results, properties and ES/ET values are listed in the separately attached CSV file: Data S1.csv.